Coh-Metrix-Esp: A Complexity Analysis Tool for Documents Written in Spanish
Abstract
Text complexity analysis is a useful task in education. For example, it can help teachers select texts appropriate to their students' educational level. The task requires analysing several text features (e.g. syntactic complexity, lexical variety), which is still done mostly by hand. In this paper, we present Coh-Metrix-Esp, a tool for complexity analysis. It is the Spanish version of Coh-Metrix and calculates 45 readability indices. We analyse how these indices behave in a corpus of “simple” and “complex” documents, and also use them as features in a binary complexity classifier for Spanish texts. In experiments with machine learning algorithms, we obtained an F-measure of 0.9 on a corpus of tales for children and adults, and an F-measure of 0.82 on a corpus of texts written for students of Spanish as a foreign language.
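As a rough illustration of the ideas in the abstract, the Python sketch below computes two simple descriptive indices and the F-measure of a binary "simple"/"complex" labelling. It is not the Coh-Metrix-Esp implementation: the function names, the choice of indices (type-token ratio and mean sentence length) and the regex-based tokenisation are assumptions made for this example only.

```python
import re


def type_token_ratio(text):
    """Distinct words divided by total words (a crude lexical-variety index)."""
    words = re.findall(r"\w+", text.lower())
    return len(set(words)) / len(words) if words else 0.0


def mean_sentence_length(text):
    """Average number of words per sentence, splitting on ., ! and ?."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"\w+", text)
    return len(words) / len(sentences) if sentences else 0.0


def f_measure(gold, predicted, positive="complex"):
    """Harmonic mean of precision and recall for the `positive` label."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

In a setting like the one the abstract describes, indices of this kind would be computed per document and passed as a feature vector to a classifier, whose predictions are then scored with the F-measure.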
Similar resources
Using Coh-Metrix to Analyze Chinese ESL Learners’ Writing
Scoring essays is costly, laborious and time-consuming. Automated scoring of essays is a promising approach to face this challenge. Coh-Metrix is a computer tool that reports on cohesion, sentence complexity, lexical sophistication and other descriptive features at sentence- and paragraph-level. It has been widely used to analyze native English speakers’ essay writing. However, few studies have u...
Understanding expert ratings of essay quality: Coh-Metrix analyses of first and second language writing
This article reviews recent studies in which human judgements of essay quality are assessed using Coh-Metrix, an automated text analysis tool. The goal of these studies is to better understand the relationship between linguistic features of essays and human judgements of writing quality. Coh-Metrix reports on a wide range of linguistic features, affording analyses of writing at various levels o...
Cohesion features in ESL reading: Comparing beginning, intermediate and advanced textbooks
This study of English as a second language (ESL) reading textbooks investigates cohesion in reading passages from 27 textbooks. The guiding research questions were whether and how cohesion differs across textbooks written for beginning, intermediate, and advanced second language readers. Using a computational tool called Coh-Metrix, textual features were compared across the three levels using M...
A Comparison of Discourse Connective Identification of Coh-Metrix and the Penn Discourse Treebank
Coh-Metrix is a linguistic tool used by many researchers to quickly measure cohesion and coherence of text. Because it is a free, easy-to-use, and quite efficient linguistic tool, it is widely used in academic research and analysis. The results of many of these studies are dependent on the accuracy of the Coh-Metrix tool. I will be testing the accuracy of Coh-Metrix, focusing on its analysis of ...
Toward a New Readability: A Mixed Model Approach
This study is a preliminary examination into the use of Coh-Metrix, a computational tool that measures cohesion and text difficulty at various levels of language, discourse, and conceptual analysis, as a means of measuring English text readability. The study uses 3 Coh-Metrix variables to analyze 32 academic reading texts and their corresponding readability scores. The results show that two ind...